1. Introduction

Let $s$ be a fixed positive integer and $\{\varepsilon_t\}_{1\le t\le T+s}$ a sequence of independent real random vectors, where $\varepsilon_t = (\varepsilon_{it})_{1\le i\le p}$ has independent coordinates satisfying $\mathbb{E}\varepsilon_{it} = 0$ and $\mathbb{E}\varepsilon_{it}^2 = 1$. Consider the so-called lag-$s$ sample autocovariance matrix of $\{\varepsilon_t\}$ defined as

$$X_T = \frac{1}{T}\sum_{t=s+1}^{s+T} \varepsilon_t \varepsilon_{t-s}^{\mathsf T} . \qquad (1.1)$$

Motivated by their application in high-dimensional statistical analysis where the dimensions $p$ and $T$ are assumed large (tending to infinity), the spectral analysis of such sample autocovariance matrices has attracted much attention in the recent random matrix theory literature. For example, perturbation theory on the matrix $X_T$ has been carried out in Lam and Yao (2012) and Li et al. (2014) for estimating the number of factors in a large-dimensional factor model of type

$$y_t = \Lambda f_t + \varepsilon_t + \mu , \qquad (1.2)$$

where $\{y_t\}$ is a $p$-dimensional sequence observed at time $t$, $\{f_t\}$ a sequence of $m$-dimensional “latent factors” ($m \ll p$) uncorrelated with the error process $\{\varepsilon_t\}$, and $\mu \in \mathbb{R}^p$ is the general mean. Since $X_T$ is not symmetric, its spectral distribution is described by the set of its singular values, which are by definition the square roots of the positive eigenvalues of

$$A_T := X_T X_T^{\mathsf T} . \qquad (1.3)$$

To the best of our knowledge, all the existing results on $X_T$ (or $A_T$) are found under what we will refer to as the Marčenko-Pastur regime, or simply the MP regime, where

$$p \to \infty, \quad T \to \infty \quad \text{and} \quad p/T \to c > 0 . \qquad (1.4)$$

For example, Jin et al. (2014) derive the limit of the empirical spectral distribution (ESD) of the symmetrized auto-covariance matrix $\frac12 (X_T + X_T^{\mathsf T})$; and Wang et al. (2013) establish the exact separation property of the ESD, which also implies the convergence of its extreme eigenvalues. For the singular value distribution of $X_T$, the limit (LSD) has been established in Li et al. (2013) using the method of Stieltjes transform and in Wang and Yao (2014) using the moment method. The latter paper also establishes the almost sure convergence of the largest singular value of $X_T$ to the right edge of the LSD, thanks to the moment method. Related results are also proposed in Liu et al. (2013) where the sequence $\{\varepsilon_t\}$ is replaced by a more general time series.

In this paper, we investigate the same questions as in Wang and Yao (2014) but under a different asymptotic regime, the so-called ultra-dimensional regime where

$$p \to \infty, \quad T \to \infty \quad \text{and} \quad p/T \to 0 . \qquad (1.5)$$

It is naturally expected that the limit under this regime will be quite different from that under the MP regime above. The findings of the paper confirm this difference by providing a new limit for the singular value distribution of $X_T$ under the ultra-dimensional regime.

In a related paper, Wang and Paul (2014), the authors also adopted the ultra-dimensional regime to derive the LSD for a large class of separable sample covariance matrices. However, the autocovariance matrix $X_T$ considered in this paper is very different from these separable sample covariance matrices.

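Recalling the definition of $A_T$ in (1.3), we have

$$A_T(i,j) = \frac{1}{T^2}\sum_{l=1}^{p}\sum_{m=1}^{T}\sum_{n=1}^{T} \varepsilon_{i,m+s}\,\varepsilon_{l,m}\,\varepsilon_{l,n}\,\varepsilon_{j,n+s} .$$

It follows by simple calculations that

$$\mathbb{E} A_T(i,j) = \begin{cases} 0, & i \ne j , \\ p/T, & i = j , \end{cases}$$

and for $i \ne j$, to leading order when $p/T \to 0$,

$$\operatorname{Var} A_T(i,j) = \mathbb{E} A_T^2(i,j) = \frac{p}{T^2}\,(1+o(1)) .$$

The row sum of the variances $\operatorname{Var} A_T(i,j)$ is thus of order $p^2/T^2$. Therefore, in order to have the spectrum of $A_T$ be of constant order when $p/T \to 0$, we should normalise it as

$$A := \frac{A_T}{\sqrt{p^2/T^2}} = \frac{T}{p}\, X_T X_T^{\mathsf T} . \qquad (1.6)$$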

The main results of the paper are as follows. First, in Section 2, we derive the almost sure limit of the singular value distribution of $\sqrt{T/p}\,X_T$ under the ultra-dimensional regime, assuming that the fourth moments of the entries $\{\varepsilon_{it}\}$ are uniformly bounded. This limit (LSD) simply equals the image measure of the semi-circle law on $[-2, 2]$ by the absolute value transformation $x \mapsto |x|$. Next, in Section 3, we establish the almost sure convergence of the largest singular value of $\sqrt{T/p}\,X_T$ to 2, assuming that the entries $\{\varepsilon_{it}\}$ have a uniformly bounded moment of order $4+\nu$ for some $\nu > 0$. Both results are derived using the moment method. Some technical details on the traditional truncation and renormalisation steps are postponed to the appendixes.
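As an informal numerical illustration of the normalisation (1.6), the following sketch simulates $X_T$ with Gaussian entries (an assumption made only for convenience; the function name and the chosen sizes are hypothetical) and inspects the spectrum of $A = (T/p) X_T X_T^{\mathsf T}$. It is not part of the proofs.

```python
import numpy as np

def normalised_autocov_spectrum(p, T, s=1, seed=0):
    """Return (singular values of sqrt(T/p) X_T, eigenvalues of A) for one draw."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal((p, T + s))   # columns eps_1, ..., eps_{T+s}
    E1 = eps[:, :T]                         # eps_1, ..., eps_T
    E2 = eps[:, s:]                         # eps_{s+1}, ..., eps_{s+T}
    X = E2 @ E1.T / T                       # lag-s sample autocovariance (1.1)
    A = (T / p) * (X @ X.T)                 # normalised matrix (1.6)
    eig = np.linalg.eigvalsh(A)             # A is symmetric positive semi-definite
    sing = np.sqrt(np.clip(eig, 0.0, None)) # guard against tiny negative round-off
    return sing, eig

sing, eig = normalised_autocov_spectrum(p=40, T=4000)
print("mean eigenvalue of A   :", eig.mean())
print("second moment of A     :", np.mean(eig ** 2))
print("largest singular value :", sing.max())
```

With $p/T$ small, the first two empirical moments should sit near 1 and 2 and the largest singular value near 2, up to finite-size fluctuations.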


2. Limiting spectral distribution by the moment method 

In this section, we show that when $p/T \to 0$, the ESD of the singular values of $\sqrt{T/p}\,X_T$ tends to a nonrandom limit, which is linked to the well-known semi-circle law.

Theorem 2.1. Suppose the following conditions hold:

(a). $\{\varepsilon_t\}_t$ is a sequence of independent $p$-dimensional real-valued random vectors with independent entries $\varepsilon_{it}$, $1 \le i \le p$, satisfying

$$\mathbb{E}(\varepsilon_{it}) = 0, \quad \mathbb{E}\varepsilon_{it}^2 = 1, \quad \sup_{it}\mathbb{E}(\varepsilon_{it}^4) < \infty . \qquad (2.1)$$

(b). Both $p$ and $T$ tend to infinity in a related way such that $p/T \to 0$.

Then, with probability one, the empirical distribution of the singular values of $\sqrt{T/p}\,X_T$ tends to the quarter law $G$ with density function

$$g(x) = \frac{1}{\pi}\sqrt{4 - x^2} , \quad 0 < x \le 2 . \qquad (2.2)$$
Remark 2.1. Recall that the quarter law $G$ is the image measure of the semi-circle law by the absolute value transformation. It is also worth noticing that if there were no lag, i.e. $s = 0$, the matrix $X_T$ would be a standard sample covariance matrix; in this case the spectral distribution of $\sqrt{T/p}\,(X_T - I_p)$ would converge to the semi-circle law, see Bai and Yin (1988). The case of an auto-covariance matrix $X_T$ with a positive lag $s > 0$ is thus very different.

Since the singular values of $\sqrt{T/p}\,X_T$ are the square roots of the eigenvalues of $\frac{T}{p} X_T X_T^{\mathsf T}$, in the remainder of this paper we focus on the limiting behaviour of the eigenvalues of $\frac{T}{p} X_T X_T^{\mathsf T}$. These properties can then be transferred to the singular values of $\sqrt{T/p}\,X_T$ by the square-root transformation $x \mapsto \sqrt{x}$.

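Theorem 2.2. Under the same conditions as in Theorem 2.1, with probability one, the empirical spectral distribution $F^A$ of the matrix $A$ in (1.6) tends to a limiting distribution $F$, which is the image measure of the semi-circle law on $[-2, 2]$ by the square transformation. In particular, its $k$-th moment is

$$m_k = \frac{1}{k}\binom{2k}{k-1} , \qquad (2.3)$$

and its Stieltjes transform $s(z)$ (with the convention $s(z) = \int (x - z)^{-1}\,dF(x)$) and density function $f(x)$ are given by

$$s(z) = -\frac12 + \frac12\sqrt{1 - \frac{4}{z}} \qquad (2.4)$$

and

$$f(x) = \frac{1}{\pi}\sqrt{\frac{1}{x} - \frac{1}{4}} , \quad 0 < x \le 4 , \qquad (2.5)$$

respectively.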
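The density of the limit can be checked by a direct change of variables. The short derivation below is a sketch in which $\varphi$ denotes the semi-circle density on $[-2,2]$ (the symbol $\varphi$ is introduced here only for illustration); it recovers the quarter-law density of the singular values and the density of $F$.

```latex
% Semi-circle density on [-2, 2]
\varphi(x) = \frac{1}{2\pi}\sqrt{4 - x^2}, \qquad -2 \le x \le 2 .

% Image under x -> |x|: both signs contribute, giving the quarter law
g(x) = \varphi(x) + \varphi(-x) = \frac{1}{\pi}\sqrt{4 - x^2}, \qquad 0 < x \le 2 .

% Image under x -> x^2: for 0 < y <= 4, with Jacobian 1/(2\sqrt{y})
f(y) = \frac{\varphi(\sqrt{y}) + \varphi(-\sqrt{y})}{2\sqrt{y}}
     = \frac{1}{2\pi}\sqrt{\frac{4 - y}{y}}
     = \frac{1}{\pi}\sqrt{\frac{1}{y} - \frac{1}{4}} .
```

The last expression blows up like $y^{-1/2}$ as $y \to 0$, which is still integrable.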


Remark 2.2. The $k$-th moment in (2.3) is exactly the $2k$-th moment of the LSD of a standard Wigner matrix, which is also the number of Dyck paths of length $2k$ (for the definition of Dyck paths, we refer to Tao (2012)). Notice also that the density function $f$ is unbounded at the origin.
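The identification of the moments in (2.3) with the even moments of the semi-circle law can be checked numerically. The sketch below compares the closed form with a midpoint-rule integral; the function names and the grid size are illustrative choices, not taken from the paper.

```python
from math import comb, pi
import numpy as np

def limiting_moment(k):
    """Closed form (1/k) * C(2k, k-1), i.e. the k-th Catalan number."""
    return comb(2 * k, k - 1) // k

def semicircle_even_moment(k, n=200_000):
    """Midpoint-rule approximation of the 2k-th moment of the semi-circle law on [-2, 2]."""
    dx = 4.0 / n
    x = -2.0 + dx * (np.arange(n) + 0.5)          # midpoints of n equal cells
    density = np.sqrt(4.0 - x ** 2) / (2.0 * pi)   # semi-circle density
    return float(np.sum(x ** (2 * k) * density) * dx)

for k in range(1, 5):
    print(k, limiting_moment(k), semicircle_even_moment(k))
```

The integer division in `limiting_moment` is exact because $\binom{2k}{k-1}$ is always divisible by $k$.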
The remainder of the section is devoted to the proof of Theorem 2.2 using the moment method. The $k$-th moment of the ESD of $A$ is

$$m_k(A) = \frac{1}{p}\operatorname{tr} A^k = \frac{1}{p^{k+1}T^k}\sum_{\mathbf i=1}^{T}\sum_{\mathbf j=1}^{p} \varepsilon_{j_1 i_1}\varepsilon_{j_1 i_2}\varepsilon_{j_2\,s+i_2}\varepsilon_{j_2\,s+i_3}\cdots \varepsilon_{j_{2k-1} i_{2k-1}}\varepsilon_{j_{2k-1} i_{2k}}\varepsilon_{j_{2k}\,s+i_{2k}}\varepsilon_{j_{2k}\,s+i_1} . \qquad (2.6)$$

Here, the indexes in $\mathbf i = (i_1, \dots, i_{2k})$ run over $1, 2, \dots, T$ and the indexes in $\mathbf j = (j_1, \dots, j_{2k})$ run over $1, 2, \dots, p$.

The core of the proof is to establish the following two assertions:

(I). $\mathbb{E}\,m_k(A) \to m_k = \dfrac{1}{k}\dbinom{2k}{k-1}$, $k \ge 1$;

(II). $\displaystyle\sum_{p=1}^{\infty} \operatorname{Var}(m_k(A)) < \infty$.

These are established in Subsections 2.1, 2.2 and 2.3 below. It follows from these assertions that almost surely, $m_k(A) \to m_k$ for all $k \ge 1$. Since the limiting moment sequence $(m_k)$ clearly satisfies Carleman's condition, i.e. $\sum_{k=1}^{\infty} m_{2k}^{-1/(2k)} = \infty$, we deduce that almost surely, the sequence of ESDs $F^A$ converges weakly to a probability measure $F$ whose moments are exactly $(m_k)$. Next, notice that $m_k$ is exactly the number of Dyck paths of length $2k$ (Tao, 2012), which is also the $2k$-th moment of the semi-circle law with support $[-2, 2]$; it follows that the LSD $F$ equals the image of the semi-circle law by the square transformation $x \mapsto x^2$. The formulas (2.4) and (2.5) are thus easily derived and the proof of Theorem 2.2 is complete.
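For completeness, here is one way to verify Carleman's condition for the Catalan moments; this is a sketch based on the elementary bound $\binom{2n}{n} \le 4^n$ and is not necessarily the argument the authors had in mind.

```latex
% Catalan moments are bounded by powers of 4:
m_n = \frac{1}{n}\binom{2n}{n-1} = \frac{1}{n+1}\binom{2n}{n} \le 4^{n} .

% Hence each term of the Carleman series is bounded below:
m_{2k}^{-1/(2k)} \ge \left(4^{2k}\right)^{-1/(2k)} = \frac{1}{4},
\qquad\text{so}\qquad
\sum_{k=1}^{\infty} m_{2k}^{-1/(2k)} \ge \sum_{k=1}^{\infty} \frac14 = \infty .
```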


2.1. Preliminary steps and some graph concepts 

We now turn to the proofs of Assertions (I) and (II). First we show that with a uniformly bounded fourth-order moment, the variables $\{\varepsilon_{it}\}$ can be truncated at rate $\eta T^{1/4}$ for some vanishing sequence $\eta = \eta(T)$. This is justified in Appendix A. After these truncation, centralisation and rescaling steps, we may assume in all the following that

$$\mathbb{E}(\varepsilon_{ij}) = 0, \quad \mathbb{E}(\varepsilon_{ij}^2) = 1, \quad |\varepsilon_{ij}| \le \eta T^{1/4} , \qquad (2.7)$$

where $\eta$ is chosen such that $\eta \to 0$ but $\eta T^{1/4} \to \infty$.
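The three preprocessing steps can be mimicked on simulated data. The sketch below is only an illustration: it centres and rescales with the pooled empirical mean and standard deviation, whereas the appendix works with the population mean and variance of each truncated entry, and the function name and the value of `eta` are hypothetical.

```python
import numpy as np

def truncate_centre_rescale(eps, eta):
    """Truncate entries at eta * T^(1/4), then centre and rescale empirically.

    Assumption: pooled empirical moments stand in for the entrywise
    population moments used in the paper's appendix.
    """
    T = eps.shape[1]
    level = eta * T ** 0.25
    truncated = np.where(np.abs(eps) <= level, eps, 0.0)  # eps * 1{|eps| <= level}
    centred = truncated - truncated.mean()
    return centred / centred.std(), level

rng = np.random.default_rng(1)
eps = rng.standard_t(df=5, size=(20, 10_000))   # heavy-ish tails, finite 4th moment
out, level = truncate_centre_rescale(eps, eta=0.5)
print("truncation level:", level)
print("fraction truncated:", np.mean(np.abs(eps) > level))
```

Only a small fraction of entries is affected at this level, which is the intuition behind the rank argument of Appendix A.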

Now we introduce some basic concepts for graphs associated to the big sum in (2.6). Let

$$\psi(e_1, \dots, e_m) := \text{number of distinct entities among } e_1, \dots, e_m ,$$

$$\mathbf i := (i_1, \dots, i_{2k}), \quad \mathbf j := (j_1, \dots, j_{2k}),$$

$$1 \le i_a \le T, \quad 1 \le j_b \le p, \quad a, b = 1, \dots, 2k ,$$

$$\mathcal A(t, s) := \{(\mathbf i, \mathbf j) : \psi(\mathbf i) = t, \ \psi(\mathbf j) = s\} .$$

Define $Q(\mathbf i, \mathbf j)$ as the following multigraph. Let the I-line and the J-line be two parallel lines; plot $i_1, \dots, i_{2k}$ on the I-line and $j_1, \dots, j_{2k}$ on the J-line, called the I-vertexes and J-vertexes, respectively. Draw $k$ down edges from $i_{2u-1}$ to $j_{2u-1}$, $k$ down edges from $i_{2u} + s$ to $j_{2u}$, $k$ up edges from $j_{2u-1}$ to $i_{2u}$, $k$ up edges from $j_{2u}$ to $i_{2u+1} + s$ (all these up and down edges are called vertical edges), $k$ horizontal edges from $i_{2u}$ to $i_{2u} + s$ and $k$ horizontal edges from $i_{2u-1} + s$ to $i_{2u-1}$ (with the convention that $i_{2k+1} = i_1$), where all the $u$'s range over $1 \le u \le k$. An example of the multigraph $Q(\mathbf i, \mathbf j)$ with $k = 3$ is presented in Figure 1.

[Figure 1: An example of the multigraph $Q(\mathbf i, \mathbf j)$ with $k = 3$. The I-line carries indexes in $1, \dots, T$ and the J-line carries indexes in $1, \dots, p$.]

In the graph $Q(\mathbf i, \mathbf j)$, once an I-vertex $i_l$ is fixed, so is $i_l + s$. For this reason, we glue all the I-vertexes which are connected through horizontal edges and denote the resulting graph as $M(\mathcal A(t, s))$, where $\mathcal A(t, s)$ is the index set that has $t$ distinct I-vertexes and $s$ distinct J-vertexes. An example of $M(\mathcal A(3, 4))$ that corresponds to the $Q(\mathbf i, \mathbf j)$ in Figure 1 is presented in Figure 2.

[Figure 2: An example of $M(\mathcal A(3, 4))$ that corresponds to the $Q(\mathbf i, \mathbf j)$ in Figure 1. The I-line carries three glued vertexes; the J-line carries the four distinct vertexes $j_6$, $j_1 = j_5$, $j_2 = j_4$ and $j_3$.]



2.2. Proof of Assertion (I) 

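Recalling the expression of $m_k(A)$ in (2.6), we have

$$\begin{aligned}
\mathbb{E}\,m_k(A) &= \frac{1}{p^{k+1}T^k}\sum_{\mathbf i=1}^{T}\sum_{\mathbf j=1}^{p} \mathbb{E}\big[\varepsilon_{j_1 i_1}\varepsilon_{j_1 i_2}\varepsilon_{j_2\,s+i_2}\varepsilon_{j_2\,s+i_3}\cdots \varepsilon_{j_{2k-1} i_{2k-1}}\varepsilon_{j_{2k-1} i_{2k}}\varepsilon_{j_{2k}\,s+i_{2k}}\varepsilon_{j_{2k}\,s+i_1}\big] \\
&= \sum_{t,s}\frac{1}{p^{k+1}T^k}\sum_{M(\mathcal A(t,s))} p(p-1)\cdots(p-s+1)\,T(T-1)\cdots(T-t+1) \\
&\qquad\qquad \cdot\ \mathbb{E}\big[\varepsilon_{j_1 i_1}\varepsilon_{j_1 i_2}\varepsilon_{j_2\,s+i_2}\varepsilon_{j_2\,s+i_3}\cdots \varepsilon_{j_{2k}\,s+i_{2k}}\varepsilon_{j_{2k}\,s+i_1}\big] \\
&:= \sum_{t,s} S(t,s) , \qquad (2.8)
\end{aligned}$$

where

$$S(t,s) = \frac{1}{p^{k+1}T^k}\sum_{M(\mathcal A(t,s))} p(p-1)\cdots(p-s+1)\,T(T-1)\cdots(T-t+1)\cdot \mathbb{E}\big[\varepsilon_{j_1 i_1}\varepsilon_{j_1 i_2}\cdots \varepsilon_{j_{2k}\,s+i_{2k}}\varepsilon_{j_{2k}\,s+i_1}\big] . \qquad (2.9)$$

We next state a lemma asserting that $|S(t,s)| \to 0$ except for one particular term.

Lemma 2.1. $|S(t,s)| \to 0$ as $p \to \infty$ unless $t = k$ and $s = k+1$.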

Suppose for a moment that Lemma 2.1 holds true; then according to (2.8) and (2.9), we have

$$\mathbb{E}\,m_k(A) = S(k, k+1) + o(1) = \mathbb{E}[\cdot]\cdot \#\{M(\mathcal A(k, k+1))\} + o(1) , \qquad (2.10)$$

where $\mathbb{E}[\cdot]$ refers to the expectation part in (2.9) and $\#\{M(\mathcal A(k, k+1))\}$ refers to the number of isomorphism classes that have $k$ distinct I-vertexes and $k+1$ distinct J-vertexes. First, we show that the expectation part $\mathbb{E}[\cdot]$ equals 1 when $t = k$ and $s = k+1$. Let $v_m$ denote the number of edges in $M(\mathcal A(t,s))$ whose multiplicity is $m$. Then the total number of edges satisfies

$$v_1 + 2v_2 + \cdots + 4k\,v_{4k} = 4k . \qquad (2.11)$$

Since $\mathbb{E}\varepsilon_{ij} = 0$ in (2.7), all the multiplicities of the edges in the graph $M(\mathcal A(t,s))$ should be at least two, that is, $v_1 = 0$. On the other hand, since $M(\mathcal A(t,s))$ is a connected graph with $t + s$ vertexes and $v_1 + \cdots + v_{4k}$ $(= v_2 + \cdots + v_{4k})$ distinct edges, we have when $t = k$ and $s = k+1$:

$$\begin{aligned}
2k + 1 = t + s &\le v_1 + \cdots + v_{4k} + 1 = v_2 + \cdots + v_{4k} + 1 \\
&\le \frac12\,(2v_2 + 3v_3 + \cdots + 4k\,v_{4k}) + 1 = 2k + 1 , \qquad (2.12)
\end{aligned}$$

where the last equality is due to (2.11) with $v_1 = 0$. Hence all the inequalities in (2.12) become equalities, that is,

$$v_2 + \cdots + v_{4k} + 1 = \frac12\,(2v_2 + 3v_3 + \cdots + 4k\,v_{4k}) + 1 = 2k + 1 ,$$

which leads to the fact that

$$v_3 = v_4 = \cdots = v_{4k} = 0, \quad v_2 = 2k . \qquad (2.13)$$

This means that every edge in the graph $M(\mathcal A(k, k+1))$ is repeated exactly twice, so the expectation part satisfies

$$\mathbb{E}\big[\varepsilon_{j_1 i_1}\varepsilon_{j_1 i_2}\cdots \varepsilon_{j_{2k}\,s+i_{2k}}\varepsilon_{j_{2k}\,s+i_1}\big] = 1 . \qquad (2.14)$$
Second, the number of isomorphism classes in $M(\mathcal A(t,s))$ (with each edge repeated at least twice in the original graph $Q(\mathbf i, \mathbf j)$) is given by the notation $f_{t-1}(k)$ in Wang and Yao (2014), where

$$f_{t-1}(k) = \frac{1}{k}\binom{2k}{t-1}\binom{k}{t} .$$

Therefore, in the special case when $t = k$ and $s = k+1$, we have

$$\#\{M(\mathcal A(k, k+1))\} = f_{k-1}(k) = \frac{1}{k}\binom{2k}{k-1} . \qquad (2.15)$$

Finally, combining (2.10), (2.14) and (2.15), we have

$$\mathbb{E}\,m_k(A) = \frac{1}{k}\binom{2k}{k-1} + o(1) .$$

Assertion (I) is then proved.
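For very small $k$, the count in (2.15) can be checked by brute force. The sketch below enumerates all equality patterns of $(i_1,\dots,i_{2k})$ and $(j_1,\dots,j_{2k})$ with $t = k$ and $s = k+1$ and keeps those in which every entry of the product in (2.6) appears at least twice. As a simplifying assumption, a time index $i_a$ and a lagged index $s + i_b$ are never identified, which only discards lower-order configurations; all function names are illustrative.

```python
from collections import Counter

def restricted_growth_strings(n):
    """Yield each set partition of n positions as a restricted growth string."""
    def extend(prefix, largest):
        if len(prefix) == n:
            yield tuple(prefix)
            return
        for label in range(largest + 2):
            yield from extend(prefix + [label], max(largest, label))
    yield from extend([], -1)

def edge_counts(i, j):
    """Multiset of the entries eps_{j, time} in one term of (2.6).

    The tag "plain"/"lag" distinguishes time i_a from time s + i_a
    (assumed never equal, see the lead-in).
    """
    k = len(i) // 2
    edges = []
    for u in range(k):
        a, b, c = 2 * u, 2 * u + 1, (2 * u + 2) % (2 * k)
        edges += [(j[a], i[a], "plain"), (j[a], i[b], "plain"),
                  (j[b], i[b], "lag"), (j[b], i[c], "lag")]
    return Counter(edges)

def count_leading_classes(k):
    """Number of patterns with t = k, s = k + 1 and no single edge."""
    i_patterns = [q for q in restricted_growth_strings(2 * k) if max(q) + 1 == k]
    j_patterns = [q for q in restricted_growth_strings(2 * k) if max(q) + 1 == k + 1]
    return sum(1 for i in i_patterns for j in j_patterns
               if min(edge_counts(i, j).values()) >= 2)

print(count_leading_classes(1), count_leading_classes(2))
```

For $k = 1$ and $k = 2$ the counts agree with $\frac1k\binom{2k}{k-1}$; for larger $k$ the formula predicts 5, 14, and so on, and the enumeration grows quickly with $k$.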


It remains to prove Lemma 2.1. 

Proof (of Lemma 2.1). Denote by $b_l$ the degree associated to the I-vertex $i_l$ $(1 \le l \le t)$ in $M(\mathcal A(t,s))$; then $b_1 + \cdots + b_t = 4k$, which is the total number of edges. On the other hand, since each edge in $M(\mathcal A(t,s))$ is repeated at least twice (otherwise there exists at least one single edge, so the expectation is zero), each degree $b_l$ is at least four (recall that we glue the original I-vertexes $i_l$ and $i_l + s$ in $M(\mathcal A(t,s))$). Therefore, we have

$$4k = b_1 + \cdots + b_t \ge 4t ,$$

that is, $t \le k$.

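Now, consider the following two cases separately.

Case 1: $s > k + 1$.

Recall the definition of $v_m$ in (2.11), which satisfies

$$v_1 + 2v_2 + \cdots + 4k\,v_{4k} = 2v_2 + \cdots + 4k\,v_{4k} = 4k$$

and

$$t + s \le v_1 + \cdots + v_{4k} + 1 = v_2 + \cdots + v_{4k} + 1 .$$

We can bound the expectation part as follows:

$$\begin{aligned}
\big|\mathbb{E}[\varepsilon_{j_1 i_1}\varepsilon_{j_1 i_2}\varepsilon_{j_2\,s+i_2}\varepsilon_{j_2\,s+i_3}\cdots \varepsilon_{j_{2k}\,s+i_{2k}}\varepsilon_{j_{2k}\,s+i_1}]\big|
&\le (\mathbb{E}\varepsilon_{ij}^2)^{v_2}\,(\eta T^{1/4})^{v_3 + 2v_4 + \cdots + (4k-2)v_{4k}} \\
&= (\eta T^{1/4})^{3v_3 + 4v_4 + \cdots + 4k\,v_{4k} - 2(v_3 + v_4 + \cdots + v_{4k})} \\
&= (\eta T^{1/4})^{4k - 2(v_2 + \cdots + v_{4k})} \le (\eta T^{1/4})^{4k - 2(t+s-1)} . \qquad (2.16)
\end{aligned}$$

Then we have according to (2.9) that

$$|S(t,s)| \le \frac{1}{p^{k+1}T^k}\,p^s T^t\,(\eta T^{1/4})^{4k-2(t+s-1)}\,\#\{M(\mathcal A(t,s))\} = O\!\left(\frac{p^{\,s-k-1}}{T^{(s-t-1)/2}}\right) , \qquad (2.17)$$

where the last equality is due to the facts that $\eta^{4k-2(t+s-1)} \le 1$ and that $\#\{M(\mathcal A(t,s))\}$ is a function of $k$ ($k$ is fixed), which can be bounded by a large enough constant.

Since $s > k + 1$ and $t + s - 1 \le 2k$, we have

$$\frac{s}{2} - \frac{t}{2} - \frac12 = s - \left(\frac{s}{2} + \frac{t}{2} - \frac12\right) - 1 \ge s - k - 1 ,$$

which is

$$0 < s - k - 1 \le \frac12\,(s - t - 1) .$$

So (2.17) reduces to

$$|S(t,s)| \le O\!\left(\left(\frac{p}{T}\right)^{s-k-1}\right) \to 0 , \qquad (2.18)$$

which is due to the fact that $s - k - 1 > 0$ and $p/T \to 0$.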
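Case 2: $s \le k + 1$, but not ($t = k$ and $s = k + 1$).

For the same reason as before, we have $t$ distinct I-vertexes, each of degree at least four, so we have another estimate for the expectation part:

$$\big|\mathbb{E}[\varepsilon_{j_1 i_1}\varepsilon_{j_1 i_2}\varepsilon_{j_2\,s+i_2}\cdots \varepsilon_{j_{2k}\,s+i_{2k}}\varepsilon_{j_{2k}\,s+i_1}]\big| \le (\eta T^{1/4})^{4k-4t} . \qquad (2.19)$$

Therefore,

$$|S(t,s)| \le \frac{1}{p^{k+1}T^k}\,p^s T^t\,(\eta T^{1/4})^{4k-4t}\,\#\{M(\mathcal A(t,s))\} = O\!\left(\frac{\eta^{4k-4t}}{p^{\,k+1-s}}\right) , \qquad (2.20)$$

which is also due to the fact that $\#\{M(\mathcal A(t,s))\} = O(1)$.

Case 2 contains three situations:

(1). $t = k$ and $s < k + 1$: $|S(t,s)| \le O\!\left(\dfrac{1}{p^{\,k+1-s}}\right) \to 0$;

(2). $t < k$ and $s = k + 1$: $|S(t,s)| \le O\!\left(\eta^{4k-4t}\right) \to 0$;

(3). $t < k$ and $s < k + 1$: $|S(t,s)| \le O\!\left(\dfrac{\eta^{4k-4t}}{p^{\,k+1-s}}\right) \to 0$. $\qquad (2.21)$

Combining (2.18) and (2.21), we have $|S(t,s)| \to 0$ as $p \to \infty$ unless $t = k$ and $s = k + 1$. □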


2.3. Proof of Assertion (II) 
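Recall that

$$\operatorname{Var}(m_k(A)) = \frac{1}{p^{2k+2}T^{2k}}\sum_{\mathbf i_1, \mathbf j_1, \mathbf i_2, \mathbf j_2}\Big[\mathbb{E}\big(\varepsilon_{Q(\mathbf i_1,\mathbf j_1)}\varepsilon_{Q(\mathbf i_2,\mathbf j_2)}\big) - \mathbb{E}\big(\varepsilon_{Q(\mathbf i_1,\mathbf j_1)}\big)\,\mathbb{E}\big(\varepsilon_{Q(\mathbf i_2,\mathbf j_2)}\big)\Big] , \qquad (2.22)$$

where $\varepsilon_{Q(\mathbf i,\mathbf j)}$ denotes the product of the $4k$ entries appearing in (2.6) for the index pair $(\mathbf i, \mathbf j)$.

If $Q(\mathbf i_1, \mathbf j_1)$ has no edges coincident with edges of $Q(\mathbf i_2, \mathbf j_2)$, then

$$\mathbb{E}\big(\varepsilon_{Q(\mathbf i_1,\mathbf j_1)}\varepsilon_{Q(\mathbf i_2,\mathbf j_2)}\big) - \mathbb{E}\big(\varepsilon_{Q(\mathbf i_1,\mathbf j_1)}\big)\,\mathbb{E}\big(\varepsilon_{Q(\mathbf i_2,\mathbf j_2)}\big) = 0$$

by independence between $\varepsilon_{Q(\mathbf i_1,\mathbf j_1)}$ and $\varepsilon_{Q(\mathbf i_2,\mathbf j_2)}$. If $Q = Q(\mathbf i_1, \mathbf j_1) \cup Q(\mathbf i_2, \mathbf j_2)$ has an overall single edge, then

$$\mathbb{E}\big(\varepsilon_{Q(\mathbf i_1,\mathbf j_1)}\varepsilon_{Q(\mathbf i_2,\mathbf j_2)}\big) = \mathbb{E}\big(\varepsilon_{Q(\mathbf i_1,\mathbf j_1)}\big)\,\mathbb{E}\big(\varepsilon_{Q(\mathbf i_2,\mathbf j_2)}\big) = 0 ,$$

so in the above two cases the corresponding terms contribute nothing to $\operatorname{Var}(m_k(A))$.

Now, suppose $Q = Q(\mathbf i_1, \mathbf j_1) \cup Q(\mathbf i_2, \mathbf j_2)$ has no single edge and that $Q(\mathbf i_1, \mathbf j_1)$ and $Q(\mathbf i_2, \mathbf j_2)$ have common edges. Let the numbers of vertexes of $Q(\mathbf i_1, \mathbf j_1)$, $Q(\mathbf i_2, \mathbf j_2)$ and $Q = Q(\mathbf i_1, \mathbf j_1) \cup Q(\mathbf i_2, \mathbf j_2)$ on the I-line be $t_1$, $t_2$, $t$, respectively, and the numbers of vertexes on the J-line be $s_1$, $s_2$, $s$, respectively. Since $Q(\mathbf i_1, \mathbf j_1)$ and $Q(\mathbf i_2, \mathbf j_2)$ have common edges, we must have $t \le t_1 + t_2 - 1$ and $s \le s_1 + s_2 - 1$.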

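Similar to (2.16) and (2.19), we have two bounds for $|\mathbb{E}(\varepsilon_{Q(\mathbf i_1,\mathbf j_1)}\varepsilon_{Q(\mathbf i_2,\mathbf j_2)})|$:

$$\big|\mathbb{E}\big(\varepsilon_{Q(\mathbf i_1,\mathbf j_1)}\varepsilon_{Q(\mathbf i_2,\mathbf j_2)}\big)\big| \le (\eta T^{1/4})^{8k - 2(t+s-1)} , \qquad (2.23)$$

or

$$\big|\mathbb{E}\big(\varepsilon_{Q(\mathbf i_1,\mathbf j_1)}\varepsilon_{Q(\mathbf i_2,\mathbf j_2)}\big)\big| \le (\eta T^{1/4})^{8k - 4t} . \qquad (2.24)$$

For the same reason, we have also

$$\big|\mathbb{E}\,\varepsilon_{Q(\mathbf i_1,\mathbf j_1)}\,\mathbb{E}\,\varepsilon_{Q(\mathbf i_2,\mathbf j_2)}\big| \le (\eta T^{1/4})^{4k - 2(t_1+s_1-1) + 4k - 2(t_2+s_2-1)} \le (\eta T^{1/4})^{8k - 2(t+s-1)} , \qquad (2.25)$$

or

$$\big|\mathbb{E}\,\varepsilon_{Q(\mathbf i_1,\mathbf j_1)}\,\mathbb{E}\,\varepsilon_{Q(\mathbf i_2,\mathbf j_2)}\big| \le (\eta T^{1/4})^{4k - 4t_1 + 4k - 4t_2} \le (\eta T^{1/4})^{8k - 4t} , \qquad (2.26)$$

where the last inequalities in (2.25) and (2.26) are due to the fact that $t \le t_1 + t_2 - 1$ and $s \le s_1 + s_2 - 1$.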
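Since

$$\begin{aligned}
\operatorname{Var}(m_k(A)) &= \sum_{t,s}\frac{1}{p^{2k+2}T^{2k}}\sum_{M(\mathcal A(t,s))} p(p-1)\cdots(p-s+1)\,T(T-1)\cdots(T-t+1) \\
&\qquad\qquad \cdot\ \Big[\mathbb{E}\big(\varepsilon_{Q(\mathbf i_1,\mathbf j_1)}\varepsilon_{Q(\mathbf i_2,\mathbf j_2)}\big) - \mathbb{E}\big(\varepsilon_{Q(\mathbf i_1,\mathbf j_1)}\big)\,\mathbb{E}\big(\varepsilon_{Q(\mathbf i_2,\mathbf j_2)}\big)\Big] \\
&:= \sum_{t,s} S(t,s) , \qquad (2.27)
\end{aligned}$$

using (2.23), (2.24), (2.25) and (2.26), we can bound the value of $|S(t,s)|$ as follows:

$$|S(t,s)| \le O\!\left(\frac{1}{p^{2k+2}T^{2k}}\,p^s T^t\,(\eta T^{1/4})^{8k-2(t+s-1)}\right) = O\!\left(\frac{p^{\,s-2k-2}}{T^{\,s/2 - t/2 - 1/2}}\right) , \qquad (2.28)$$

or

$$|S(t,s)| \le O\!\left(\frac{1}{p^{2k+2}T^{2k}}\,p^s T^t\,(\eta T^{1/4})^{8k-4t}\right) = O\!\left(p^{\,s-2k-2}\right) . \qquad (2.29)$$

Clearly,

$$t_1 + s_1 \le 2k + 1 , \quad t_2 + s_2 \le 2k + 1 ;$$

we have thus

$$t + s \le t_1 + t_2 - 1 + s_1 + s_2 - 1 \le 4k .$$

First, consider the case $s > t + 1$, where we use the bound in (2.28). We have

$$s - 2k - 2 - s/2 + t/2 + 1/2 = s/2 + t/2 - 2k - 3/2 \le -3/2 ,$$

which leads to

$$s - 2k - 2 \le -3/2 + s/2 - t/2 - 1/2 .$$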
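Combining with (2.28), we have

$$|S(t,s)| \le O\!\left(p^{-3/2}\left(\frac{p}{T}\right)^{s/2 - t/2 - 1/2}\right) \le O\!\left(p^{-3/2}\right) . \qquad (2.30)$$

Second, we use the bound in (2.29) for the case $s \le t + 1$. Recalling that $t + s \le 4k$, we have

$$s - 1 + s \le t + s \le 4k ,$$

which is

$$2s - 1 \le 4k .$$

Then, from (2.29),

$$|S(t,s)| \le O\!\left(p^{\,s-2k-2}\right) \le O\!\left(p^{(4k+1)/2 - 2k - 2}\right) = O\!\left(p^{-3/2}\right) . \qquad (2.31)$$

Combining (2.27), (2.30) and (2.31), we have

$$|\operatorname{Var}(m_k(A))| \le O\!\left(p^{-3/2}\right) ,$$

which is summable with respect to $p$. Assertion (II) is then proved.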

3. Convergence of the largest eigenvalue of A

In this section, we aim to show that the largest eigenvalue of $A$ tends to 4 almost surely, which is the right edge of its LSD.

Theorem 3.1. Under the same conditions as in Theorem 2.1, with $\sup_{it}\mathbb{E}(\varepsilon_{it}^4) < \infty$ in (2.1) replaced by $\sup_{it}\mathbb{E}(|\varepsilon_{it}|^{4+\nu}) < \infty$ for some $\nu > 0$, the largest eigenvalue of $A$ converges to 4 almost surely.

Recall that in the proof of Theorem 2.2, a main step is Lemma 2.1, which says that $|S(t,s)| \to 0$ except for one term, namely $t = k$ and $s = k + 1$. One thing to mention here is that in order to prove this lemma, $k$ is assumed to be fixed. Then the number of isomorphism classes in $M(\mathcal A(t,s))$ is a function of $k$, and thus can be bounded by a large enough constant. So we do not actually need to know the value of $\#\{M(\mathcal A(t,s))\}$ exactly. When deriving the convergence of the largest eigenvalue, however, $k$ should grow to infinity, so we cannot trivially guarantee that the number of isomorphism classes in $M(\mathcal A(t,s))$ is still of constant order. Therefore, the main task in this section is to bound this value, so that $|S(t,s)|$ ($t \ne k$ or $s \ne k+1$) remains of smaller order than the main term $|S(k, k+1)|$ when $k \to \infty$.


Proposition 3.1. Let the conditions in Theorem 2.1 hold, with $\sup_{it}\mathbb{E}(\varepsilon_{it}^4) < \infty$ in (2.1) replaced by $\sup_{it}\mathbb{E}(|\varepsilon_{it}|^{4+\nu}) < \infty$ for some $\nu > 0$, and let $k = k(p, T)$ be an integer that tends to infinity and satisfies the following conditions:

$$\begin{cases} k/\log p \to \infty , \\ kp/T \to 0 , \\ k/p \to 0 . \end{cases} \qquad (3.1)$$

Then we have

$$\mathbb{E}(m_k(A)) = \frac{1}{k}\binom{2k}{k-1}\,(1 + o_k(1)) .$$

Suppose for now that Proposition 3.1 holds true. We first show that it leads to Theorem 3.1.


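Proof (of Theorem 3.1). Using Proposition 3.1, we have the estimate

$$\mathbb{E}(m_k(A)) = \frac{1}{k}\binom{2k}{k-1}\,(1 + o_k(1)) ; \qquad (3.2)$$

then for any $\Delta > 0$, writing $l_1$ for the largest eigenvalue of $A$, we have

$$\begin{aligned}
P(l_1 > 4 + \Delta) &\le P\big(\operatorname{tr} A^k \ge (4+\Delta)^k\big) \le \frac{\mathbb{E}\operatorname{tr} A^k}{(4+\Delta)^k} = \frac{p\cdot\mathbb{E}(m_k(A))}{(4+\Delta)^k} \\
&\le \frac{p}{(4+\Delta)^k}\cdot\frac{1}{k}\binom{2k}{k-1}\cdot(1 + o_k(1)) \le \left(\frac{4\,p^{1/k}}{4+\Delta}\right)^{k}\cdot(1 + o_k(1)) . \qquad (3.3)
\end{aligned}$$

The right-hand side behaves like $\left(\frac{4}{4+\Delta}\right)^k$ since $k/\log p \to \infty$ (so $p^{1/k} \to 1$). Once we fix this $\Delta > 0$, (3.3) is summable.

The matching lower bound for $l_1$ follows directly from Theorem 2.2, since 4 is the right edge of the LSD. □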
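To get a feel for how fast the Markov-type bound $p\cdot\frac1k\binom{2k}{k-1}/(4+\Delta)^k$ in (3.3) decays, the following sketch evaluates its logarithm. The helper names and the sample values of $p$, $k$ and $\Delta$ are illustrative assumptions only.

```python
import math

def log_catalan(k):
    """log of (1/k) * C(2k, k-1) = (2k)! / (k! (k+1)!), via log-gamma."""
    return math.lgamma(2 * k + 1) - math.lgamma(k + 1) - math.lgamma(k + 2)

def log_tail_bound(p, k, delta):
    """log of p * Catalan(k) / (4 + delta)^k, the middle expression of (3.3)."""
    return math.log(p) + log_catalan(k) - k * math.log(4.0 + delta)

for k in (10, 50, 200):
    print(k, log_tail_bound(p=1000, k=k, delta=0.5))
```

For small $k$ the bound is vacuous (its logarithm is positive), while for $k$ large compared with $\log p$ it becomes tiny, which is why the condition $k/\log p \to \infty$ appears in (3.1).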


Now it remains to prove Proposition 3.1.

Proof (of Proposition 3.1). After truncation, centralisation and rescaling, we may assume that the $\varepsilon_{it}$'s satisfy

$$\mathbb{E}(\varepsilon_{it}) = 0, \quad \operatorname{Var}(\varepsilon_{it}) = 1, \quad |\varepsilon_{it}| \le \delta T^{1/2} , \qquad (3.4)$$

where $\delta$ is chosen such that

$$\begin{cases} \delta \to 0 , \\ \delta T^{1/2} \to \infty , \\ \delta^2 k\sqrt{T} \to 0 , \\ \dfrac{kp}{\delta^2 T} \to \infty . \end{cases} \qquad (3.5)$$

More detailed justifications of (3.4) are provided in Appendix B.

From the proof of Theorem 2.2, we have

$$\mathbb{E}\,m_k(A) = \sum_{t,s} S(t,s) = S(k, k+1) + o(1) = \frac{1}{k}\binom{2k}{k-1} + o(1) ,$$

where $S(k, k+1)$ is the main term that contributes to $\mathbb{E}\,m_k(A)$, while all other terms can be neglected. Therefore, it remains to prove that when $k \to \infty$, we still have

$$\sum_{t \ne k \ \text{or}\ s \ne k+1} |S(t,s)| = \frac{1}{k}\binom{2k}{k-1}\cdot o_k(1) .$$

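We again consider two cases:

Case 1: $s > k + 1$;

Case 2: $s \le k + 1$, but not ($t = k$ and $s = k + 1$).

Similar to (2.16) and (2.19), we have two bounds for the expectation part:

$$\big|\mathbb{E}[\varepsilon_{j_1 i_1}\varepsilon_{j_1 i_2}\varepsilon_{j_2\,s+i_2}\varepsilon_{j_2\,s+i_3}\cdots \varepsilon_{j_{2k}\,s+i_{2k}}\varepsilon_{j_{2k}\,s+i_1}]\big| \le (\delta T^{1/2})^{4k-2(t+s-1)} , \qquad (3.6)$$

or

$$\big|\mathbb{E}[\varepsilon_{j_1 i_1}\varepsilon_{j_1 i_2}\varepsilon_{j_2\,s+i_2}\varepsilon_{j_2\,s+i_3}\cdots \varepsilon_{j_{2k}\,s+i_{2k}}\varepsilon_{j_{2k}\,s+i_1}]\big| \le (\delta T^{1/2})^{4k-4t} . \qquad (3.7)$$

Consider $t = 1$ first. From Wang et al. (2013), the number of isomorphism classes $\#\{M(\mathcal A(1,s))\}$ is bounded by $\binom{2k}{2k-s}$; combining this with (2.9) and (3.6), we have

$$|S(1,s)| \le \frac{1}{p^{k+1}T^k}\,p^s\,T\,(\delta T^{1/2})^{4k-2s}\binom{2k}{2k-s} . \qquad (3.8)$$

Then,

$$\sum_{s=1}^{2k}|S(1,s)| \le \sum_{s=1}^{2k}\frac{1}{p^{k+1}T^k}\,p^s\,T\,(\delta T^{1/2})^{4k-2s}\binom{2k}{2k-s} = \frac{\delta^{4k}T^{k+1}}{p^{k+1}}\sum_{s=1}^{2k}\left(\frac{p}{\delta^2 T}\right)^{s}\binom{2k}{2k-s} . \qquad (3.9)$$

The summation on the right-hand side of (3.9) can be bounded by

$$\sum_{s=1}^{2k}\left(\frac{2kp}{\delta^2 T}\right)^{s} ,$$

which is dominated by the term $s = 2k$ since $\frac{kp}{\delta^2 T} \to \infty$. Then (3.9) reduces to

$$\frac{1}{p^{k+1}T^k}\,p^{2k}\,T = \left(\frac{p}{T}\right)^{k-1} \to 0 . \qquad (3.10)$$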
Next, we consider Case 1 and Case 2 (when $t > 1$) separately. According to Wang et al. (2013), the number of isomorphism classes in $M(\mathcal A(t,s))$ ($t > 1$) is bounded by

$$f_{t-1}(k)\binom{2k-t}{s-1} , \qquad (3.11)$$

where

$$f_{t-1}(k) = \frac{1}{k}\binom{2k}{t-1}\binom{k}{t} .$$

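Case 1 ($s > k + 1$ and $t > 1$): The expectation part can be bounded by (3.6), and combining this with (2.9) and (3.11), we have

$$\begin{aligned}
|S(t,s)| &\le \frac{1}{p^{k+1}T^k}\sum_{M(\mathcal A(t,s))} p^s T^t\,\big|\mathbb{E}[\varepsilon_{j_1 i_1}\varepsilon_{j_1 i_2}\cdots \varepsilon_{j_{2k}\,s+i_{2k}}\varepsilon_{j_{2k}\,s+i_1}]\big| \\
&\le \frac{1}{p^{k+1}T^k}\,p^s T^t\,(\delta T^{1/2})^{4k-2(t+s-1)}\,f_{t-1}(k)\binom{2k-t}{s-1} . \qquad (3.12)
\end{aligned}$$

Since $s \ge k + 2$, $t \ge 2$, and we have the trivial relationship $t + s - 1 \le 2k$, it follows that

$$\sum_{t,s}|S(t,s)| \le \sum_{t=2}^{k-1}\ \sum_{s=k+2}^{2k+1-t}\frac{1}{p^{k+1}T^k}\,p^s T^t\,(\delta T^{1/2})^{4k-2(t+s-1)}\,f_{t-1}(k)\binom{2k-t}{s-1} . \qquad (3.13)$$

The summation over $s$ in (3.13) can be bounded as follows:

$$\sum_{s=k+2}^{2k+1-t}\left(\frac{p}{\delta^2 T}\right)^{s}\binom{2k-t}{s-1} \le \sum_{s=k+2}^{2k+1-t}\left(\frac{2kp}{\delta^2 T}\right)^{s} , \qquad (3.14)$$

and since $\frac{kp}{\delta^2 T} \to \infty$, the summation in (3.14) is dominated by the term $s = 2k + 1 - t$. Therefore, (3.13) reduces to

$$\begin{aligned}
&\sum_{t=2}^{k-1}\frac{1}{p^{k+1}T^k}\,p^{2k+1-t}\,T^t\,(\delta T^{1/2})^{4k-2(t+2k+1-t-1)}\,f_{t-1}(k)\binom{2k-t}{2k+1-t-1} \\
&\qquad = \sum_{t=2}^{k-1}\left(\frac{p}{T}\right)^{k-t} f_{t-1}(k) = \sum_{t=2}^{k-1}\frac{1}{k}\binom{2k}{t-1}\binom{k}{t}\left(\frac{p}{T}\right)^{k-t} . \qquad (3.15)
\end{aligned}$$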
For the same reason, the term inside the summation on the right-hand side of (3.15) can be bounded by

$$\frac{1}{k}\left(\frac{p}{T}\right)^{k}\left(\frac{2k^2 T}{p}\right)^{t} ,$$

and since $Tk^2/p \to \infty$, the dominating term in (3.15) is the one with $t = k - 1$, which reduces to

$$\frac{1}{k}\binom{2k}{k-2}\binom{k}{k-1}\cdot\frac{p}{T} . \qquad (3.16)$$

Since $kp/T \to 0$, (3.16) equals

$$\frac{k(k-1)}{k+2}\cdot\frac{p}{T}\cdot\frac{1}{k}\binom{2k}{k-1} = \frac{1}{k}\binom{2k}{k-1}\cdot o_k(1) . \qquad (3.17)$$

Therefore, in this case, we have

$$\sum_{t,s}|S(t,s)| = \frac{1}{k}\binom{2k}{k-1}\cdot o_k(1) . \qquad (3.18)$$


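Case 2 ($2 \le t \le k$ and $s \le k + 1$): For the same reason, combining the bound of the expectation part in (3.7) with (2.9) and (3.11), we have

$$\begin{aligned}
|S(t,s)| &\le \frac{1}{p^{k+1}T^k}\sum_{M(\mathcal A(t,s))} p^s T^t\,\big|\mathbb{E}[\varepsilon_{j_1 i_1}\varepsilon_{j_1 i_2}\cdots \varepsilon_{j_{2k}\,s+i_{2k}}\varepsilon_{j_{2k}\,s+i_1}]\big| \\
&\le \frac{1}{p^{k+1}T^k}\,p^s T^t\,(\delta T^{1/2})^{4k-4t}\,f_{t-1}(k)\binom{2k-t}{s-1} . \qquad (3.19)
\end{aligned}$$

Therefore, we have

$$\sum_{t,s}|S(t,s)| \le \sum_{t=2}^{k}\ \sum_{s=1}^{k+1}\delta^{4k-4t}\,T^{k-t}\,p^{\,s-k-1}\,f_{t-1}(k)\binom{2k-t}{s-1} . \qquad (3.20)$$

We also consider the following three situations:

(1). $t = k$ and $s < k + 1$,

(2). $1 < t < k$ and $s = k + 1$,

(3). $1 < t < k$ and $s < k + 1$,

and show that in all three situations, (3.20) is bounded by

$$\frac{1}{k}\binom{2k}{k-1}\cdot o_k(1) .$$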

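For situation (1), (3.20) reduces to

$$\sum_{s=1}^{k} p^{\,s-k-1}\,f_{k-1}(k)\binom{k}{s-1} = \frac{1}{k}\binom{2k}{k-1}\sum_{s=1}^{k} p^{\,s-k-1}\binom{k}{s-1} , \qquad (3.21)$$

which can be bounded as

$$\frac{1}{k}\binom{2k}{k-1}\cdot\frac{1}{p^{k+1}}\sum_{s=1}^{k}(kp)^{s} .$$

Therefore, the dominating term is the one with $s = k$, and thus (3.21) reduces to

$$\frac{1}{k}\binom{2k}{k-1}\cdot\frac{1}{p}\binom{k}{k-1} = \frac{k}{p}\cdot\frac{1}{k}\binom{2k}{k-1} = \frac{1}{k}\binom{2k}{k-1}\cdot o_k(1) ,$$

which is due to the choice of $k$ such that $k/p \to 0$.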
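For situation (2), (3.20) reduces to

$$\sum_{t=2}^{k-1}\delta^{4k-4t}\,T^{k-t}\,f_{t-1}(k)\binom{2k-t}{k} = \sum_{t=2}^{k-1}\delta^{4k-4t}\,T^{k-t}\cdot\frac{1}{k}\binom{2k}{t-1}\binom{k}{t}\binom{2k-t}{k} . \qquad (3.22)$$

The right-hand side of (3.22) can be bounded by

$$\sum_{t=2}^{k-1}\frac{\delta^{4k}(2kT)^{k}}{k}\left(\frac{2k^2}{\delta^4 T}\right)^{t} , \qquad (3.23)$$

which is dominated by the term $t = k - 1$ since $\frac{k^2}{\delta^4 T} \to \infty$. Therefore, (3.22) is bounded by

$$\delta^4 T\cdot\frac{1}{k}\binom{2k}{k-2}\binom{k}{k-1}\binom{k+1}{k} = \delta^4 T\cdot\frac{k(k-1)(k+1)}{k+2}\cdot\frac{1}{k}\binom{2k}{k-1} = \frac{1}{k}\binom{2k}{k-1}\cdot o_k(1) ,$$

which is due to the fact that $\delta^4 T k^2 = (\delta^2\sqrt{T}\,k)^2 \to 0$.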
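For situation (3), (3.20) reduces to

$$\sum_{t=2}^{k-1}\sum_{s=1}^{k}\delta^{4k-4t}\,T^{k-t}\,p^{\,s-k-1}\,f_{t-1}(k)\binom{2k-t}{s-1} = \sum_{t=2}^{k-1}\sum_{s=1}^{k}\delta^{4k-4t}\,T^{k-t}\,p^{\,s-k-1}\cdot\frac{1}{k}\binom{2k}{t-1}\binom{k}{t}\binom{2k-t}{s-1} . \qquad (3.24)$$

The part of the summation over $s$ is

$$\sum_{s=1}^{k} p^{s}\binom{2k-t}{s-1} ,$$

which can be bounded by

$$\sum_{s=1}^{k}(2kp)^{s} ;$$

therefore, the dominating term is the one with $s = k$. So (3.24) reduces to

$$\sum_{t=2}^{k-1}\frac{\delta^{4k-4t}\,T^{k-t}}{p}\cdot\frac{1}{k}\binom{2k}{t-1}\binom{k}{t}\binom{2k-t}{k-1} . \qquad (3.25)$$

For the same reason, the right-hand side of (3.25) can be bounded by

$$\sum_{t=2}^{k-1}\frac{(\delta T^{1/4})^{4k}(2k)^{k-1}}{kp}\left(\frac{2k^2}{T\delta^4}\right)^{t} ,$$

which is dominated by the term $t = k - 1$ since $\frac{k^2}{T\delta^4} \to \infty$. Therefore, (3.25) reduces to

$$\frac{\delta^4 T}{p}\cdot\frac{1}{k}\binom{2k}{k-2}\binom{k}{k-1}\binom{k+1}{k-1} \le \frac{\delta^4 k^3 T}{p}\cdot\frac{1}{k}\binom{2k}{k-1} , \qquad (3.26)$$

and since $\frac{\delta^4 k^3 T}{p} = (\delta^2 k\sqrt{T})^2\cdot k/p \to 0$, (3.26) equals

$$\frac{1}{k}\binom{2k}{k-1}\cdot o_k(1) .$$

Finally, in all three situations, we have

$$\sum_{t,s}|S(t,s)| = \frac{1}{k}\binom{2k}{k-1}\cdot o_k(1) .$$

The proof of Proposition 3.1 is complete. □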


Appendix A: Justification of truncation, centralisation and rescaling in (2.7) 
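A.1. Truncation

Define two $p \times T$ matrices

$$E_1 := (\varepsilon_1\ \varepsilon_2\ \cdots\ \varepsilon_{T-1}\ \varepsilon_T) , \qquad E_2 := (\varepsilon_{s+1}\ \varepsilon_{s+2}\ \cdots\ \varepsilon_{s+T-1}\ \varepsilon_{s+T}) , \qquad (A.1)$$

then

$$X_T = \frac{1}{T}\sum_{t=s+1}^{s+T}\varepsilon_t\varepsilon_{t-s}^{\mathsf T} = \frac{1}{T}E_2E_1^{\mathsf T} , \qquad (A.2)$$

and our target matrix is

$$A = \frac{T}{p}X_TX_T^{\mathsf T} = \frac{1}{pT}E_2E_1^{\mathsf T}E_1E_2^{\mathsf T} . \qquad (A.3)$$

Let

$$\hat\varepsilon_{ij} = \varepsilon_{ij}\,\mathbf 1_{\{|\varepsilon_{ij}| \le \eta T^{1/4}\}} ;$$

$\hat X_T$ and $\hat A$ are defined by replacing all the $\varepsilon_{ij}$ with $\hat\varepsilon_{ij}$ in (A.2) and (A.3). Using Theorem A.44 in Bai and Silverstein (2010) and the inequality

$$\operatorname{rank}(AB - CD) \le \operatorname{rank}(A - C) + \operatorname{rank}(B - D) ,$$

we have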


E^{x)-E^{x) = 


Xxrxf, 


< -rank ( \ —Xt — \ —Xt | = -rank ( Xt — Xt 

P \\l P \l P J P 

^ , / 1 „ ^T ^ TP TpT\ _ ^ _ I TP tpT 


= -rank y—E 2 El - —E 2 E{j = -rank [E 2 E{ - A' 2 -Ef 

< -rank (E 2 — E 2 '] + -rank (Ei — Ei 

p \ / p \ 

2 / \ 2 

= -’^ank (El-El) 


P 

Since supjjE(£)'j) < 00 , we have always 
1 


P t=i ,=i 


p^pT ^ 






'(|£iq>r,Tl/4) 


j —> 0 as p, T —)■ cx) . 


Consider the expectation and variance of ^ Yl^=i 

p T 


® EEi 


P 


{\sii\>riTY^} ^ 


i=l j=l 
p T 


< - 


„E.i: 

^ i=i j=i 






Var - EE l{|£i,|>r?Ti/4} < EE 


i=l j=l 


4 ■ l{|£i,|>»)Tl/4} 


P^ 


i=l j=l 


p^T 


(A.4) 


= 0 ( 1 ) , 


= o(-). 
P 
Applying Bernstein’s inequality, for all small $\epsilon > 0$ and large $p$, we have


^ 1 P >e] <2e 2 ^ ^ . 


p T 


(A.5) 


i=i j=i 

Finally, combining (A.4) and (A.5) with the Borel-Cantelli lemma, we have with probability 1,

$$F^{A}(x) - F^{\hat A}(x) \to 0 .$$


A.2. Centralisation 


Let 


Xt and A are dehned by involving the iijA in (A.2) and (A.3). 
Similar to (A.4), we have 


F^{x) - F^{x) 


< -rank 4 / —Xt 
p \y p 


Fxr 

P 


= -rank (^E 2 Ef — < -rank (^Ei — 

2 - 2 

= -rank(E(F'i)) =- )-0, asp—)-cx). 

p \ ^ p 


Therefore, we have 


F^{x) - F^{x) 


0 


A.3. Rescaling 


Let 


^ij 1 


then for the same reason as (A.4), we have 
F^{x) - F^{x) 


< -rank (E 2 — E 2 '] + -rank {Ei — Ei 
p V / p 
2 / ~ ^ \ 2 / 1
= -rank [ Ei — Ei] < - max 1- 


P 


p 


a. 




■ rank (Ei 


< - max [ 1-) min{p, T} 


p 


cr,; 


= O ( I 


max 1- 


Since 


\ \ CT; 


4 = = Var(%) = Yai{eij ■ l{|e,^.|<^ri/4}) 

—)■ Var(e:jj) = 1 , as T —)■ cx) . 


Therefore, we have 




Appendix B: Justification of truncation, centralisation and rescaling in (3.4) 
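B.1. Truncation

$E_1$, $E_2$, $X_T$ and $A$ are defined in (A.1), (A.2) and (A.3). Let

$$\hat\varepsilon_{ij} = \varepsilon_{ij}\,\mathbf 1_{\{|\varepsilon_{ij}| \le \delta T^{1/2}\}} ;$$

$\hat X_T$ and $\hat A$ are defined by replacing all the $\varepsilon_{ij}$ with $\hat\varepsilon_{ij}$ in (A.2) and (A.3). With the assumption that $\sup_{it}\mathbb{E}(|\varepsilon_{it}|^{4+\nu}) < \infty$, we have always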

E 

snp — 

it 

Since 

A = , 

whose eigenvalnes are the same as those of 

B := ^E[EiE^E2 , 
then we have


max (^) = A 

max {B)-X 

max (B) 


—EfE,E^E, 


op 


—EIE,EJE2 

pi 


op 


< 


< 


\eIe,e^E2 - \eIe,eIe, 

pi pi 

—E'(EiE'^E2 - —E'(EiE'^E2 


op 


+ 


op 


—E'^EiE'^E2 - —EJEiE'^E2 


— [EiE,-EiE,]EiE2 


+ 


op 


1 

pT 


—EfE, (e^E2 - EJE2 


op 


Ji + J 2 

First, we have 


op 


(B.2) 


El El-El E 


= max x{EIE l — ElEi)x'^ 


= max 

11x11=1 


op ||x||=l 

x{ElEi - ElEi)x^ + x{ElEi - EIEi)x^ 


< max x(EIE l — ElEi)x'^ + max xiElEi — ElEAx'^ 

11x11=1 11x11 = 1 


<^11 + <^12 , 


(B.3) 


where 


= max x{EIEi — ElEi)x 

||x||=l 



p 

= max 

xiXj {ski - i 


ij 

k=l 

< max 

V 

E 

7_1 

[{^ 4 'X 


yi^\xiXj{ElEi - ElEi){i,j) 




k=l 



1/2 


E 


^k'l 


^kj 


1/2 


1/2 


E-h' E 


'fcj 


1 / 2 ' 


1/2 


2^1/2 ^ ^ ^ ^ \ 1/2 


EE' 

k=l j=l 


■kj 
/ , P T X 1/2^

0( v^- (EE ^ki ■ ^{|e*ii|>(5Ti/2} J 




k=l i=l 


< O •supE(^4 ■ 


1/2^ 


{\ekA>STY^} 


1/2^ 




where the last inequality is due to (B.1).

For the same reason, $J_{12}$ is also of the same order as (B.4). Therefore we have


EfE, - E^E, 


< o 


op 




-1//4 


Then recall the dehnition of Ji in (B.2), where 


Ji = 

< o 


— {EiE,-EiE,)E^E, 


op 


1 

< - 
P 


EfE^-EjE, 




op 


T 


(B.4) 


(B.5) 


op 


-ip/A 


—)■ 0 , as p, T —)■ cx) , 


(B.6) 


where the last inequality in (B.6) is due to (B.5) and the fact that $\frac{1}{T}\|E_2^{\mathsf T}E_2\|_{op}$ is the largest eigenvalue of the sample covariance matrix $\frac{1}{T}E_2E_2^{\mathsf T}$, which is of constant order.

For the same reason, $J_2$ is of the same order as $J_1$, and so also tends to zero. Finally, according to (B.2) we have

•^max(^) max (4 


^ 0 . 


B.2. Centralisation and Rescaling 


Let 


4 = VarE4, eit = 


Sit — Ee. 


it 




imsart-generic ver. 2014/10/16 file: autocross.tex date: January 28, 2015 


















Q. Wang and J. Yao/Singular values distribution of a ultra-large auto-covariance matrix 29 
$\tilde X_T$ and $\tilde A$ are defined by replacing all the $\varepsilon_{it}$ with $\tilde\varepsilon_{it}$ in (A.2) and (A.3). In this subsection, we will show


•^max(A) A 

max (Al) 


—)■ 0 , 


which is equivalent to showing 


Amax(-S) A max (B) 


0 


First, since 

sup 11 — I = sup 


ki 


= sup 

ki 


< 2■sup 

ki 


= O 


~ ® (^Eki ■ l{|£fei|<5ri/2} - • l{|£fci|<5Ti/2}j j 

• 1 {|£m|> 5 ti/ 2}) + (^IE(£fci ■ l{|£fei|>5ri/2})) 

^ 2 ■ supfciE(^|£fci|^+^ ■ 


• l{|£fci|>5rl/2} 


(5TV2) 


2+1^ 


„ 2 + 1 / 


where the last equality is due to (B.l). Finally, we have: 


sup 

ki 


1 - — 
^ki 


= sup 

ki 


^ki 1 


^ki 


= sup 

ki 


(T. 


ki 


^ki i^^ki H“ 1) 


= O ( sup - 1 

ki 


< O 


5^ 


where the last inequality is due to (B.7). 

Second, we have another estimation for the term sup^j |Edfcj| as follows: 


ki 


ki 


sup |E4i| = sup E[eki ■ l{|£,i|< 5 Ti/ 2 }] 


= sup 

ki 


■ l{|£fei|>5Ti/2}] 


^ supfc^E[|£fci|^+^ • l{|£fc,|>5ri/2}] ^ ^ 


(52-1/2)3+^ 
Then similar to (B.2), we have 

Amax(-B) A max {B) 


rp^+li- 

1 2 


(B.7) 


(B.8) 


(B.9) 



































< 


— {EiE,-EiE,)EiE, 


+ 


op 


—EiE,[EiE,-EiE2 


op 


J 3 + J 4 


Also, similar to (B.3) and (B.4), we have 
El El-El E 


= max x{EIE l — EIEi)x^ 
op lhll=i 


< max x{EIE l — ElEi)x'^ + max x{EIE i — ElEi)x'^ 

||a:||=l 11x11=1 


<^31 + J 32 , 


(B.IO) 


with 


Since 


J31 = max x{EIEi - ElEi)x^ = max y^XiXj ( 4 i - £ki) 
||x||=i lhll=i 

i,j k=l 


£ki 


P T \ 1/2 / P T ^1/2 

^ j ■ ( 


k=l i=l 


'kj 


k=l j=l 
1/2N 


p T 

^11 ~ 

k=l i=l 


p T P T . 

^ ~ \‘2 ^ £ki ~ ^£ki 

/ . / . ~ ~ z_^ z_^ I 


k=l i=l 
p T 


k=l i=l 


t^ki 


EE(i 

k=l i=l 




^ki 


(B.ll) 




< max O [ pT ■ ( sup 

ki 


1 - 

^ki 


, O \ pT ■ i sup 


ki 


^Eki 


O ( pT ■ sup 

ki 


1- 

^ki 


■ sup |E 4 i| 


ki 


^ , I '5V / / '5V / f iV 

< max < o I ^ j , o J , o 


(B.12) 


where the last inequality is due to (B.8) and (B.9). Then according to (B.11), we have the bound for the term $J_{31}$:


I J31I < max <; o ( , o J , o 


/ S'/Sp 


(B.13) 
For the same reason, the term $|J_{32}|$ can be bounded by (B.13) as well. Therefore, we have


iJh = 


— {EfE,-EfE^)EjE2 


1 

< - 
P 


El El-El El 


1 

op T 


op 

2 -^2 


eTe^ 


op 


— o ( -(J31 + J32) 
\p 


< max < o 




5V~5 


'J'v/2 j ’ J ' I 

Similarly, we also have $|J_4| \to 0$, which leads to the fact that


^ 0 


■^max 


(.5) Amax 


(B) 


0 


(B.14) 